Abstract: In this paper, we propose a sparse recovery algorithm called detection-directed (DD) sparse estimation using Bayesian hypothesis test (BHT) and belief propagation (BP). In this framework, we consider the use of sparse-binary sensing matrices which have the tree-like property and the sampled-message approach for the implementation of BP.

The key idea behind the proposed algorithm is that the 
recovery takes DD-estimation structure consisting of two parts: 
support detection and signal value estimation. BP and BHT 
perform the support detection, and an MMSE estimator finds 
the signal values using the detected support set. The proposed
algorithm is more robust to measurement noise than the
conventional MAP approach, and it removes the quantization
effect of the sampled-message based BP independently of the
memory size used for the message sampling.

We explain how the proposed algorithm can have the afore- 
mentioned characteristics via exemplary discussion. In addi- 
tion, our experiments validate such superiority of the proposed 
algorithm, compared to recent algorithms under noisy setup. 
Interestingly the experimental results show that performance of 
the proposed algorithm approaches that of the oracle estimator 
as SNR becomes higher. 

Index Terms — Noisy sparse recovery, sparse support detection, 
belief propagation, detection-directed estimation 



I. Introduction 

Sparse signal recovery in the presence of noise has been intensively investigated in the recent literature because any real-world device is subject to at least a small amount of noise. We refer to such problems as noisy sparse signal recovery (NSR) problems. Let $x_0 \in \mathbb{R}^N$ denote a sparse signal vector with only a few nonzero elements. Then, the NSR decoder observes a measurement vector $z = \Phi x_0 + n \in \mathbb{R}^M$, where $\Phi \in \mathbb{R}^{M \times N}$ is a fat sensing matrix $(M < N)$, and we limit our discussion to zero-mean independent and identically distributed (i.i.d.) Gaussian noise denoted by $n \in \mathbb{R}^M$.

The NSR problem can then be defined as an $l_0$-norm
minimization problem as similarly done in 0Q,(3],|7),ID: 



$$\hat{x}_{0,l_0} = \arg\min_{x} \|x\|_0 \quad \text{s.t.} \quad \|\Phi x - z\|_2 \le \epsilon, \qquad (1)$$

where $\epsilon$ is an error tolerance parameter. In general, the minimization task in (1) is NP-hard; therefore, $l_1$-norm approaches have been developed and discussed as alternatives [1]-[6]. Among the $l_1$-solvers, the Dantzig selector (L1-DS), proposed by Candes and Tao, and the $l_1$-penalized least-squares algorithm usually called LASSO [6] have been devised for the Gaussian noise setup.

Bayesian approaches to NSR have also received attention, where the minimization task in (1) is described as the maximum a posteriori (MAP) problem, and the sparse solution is then sought by imposing a sparsifying prior density on $x$ as follows [8]-[12]:

$$\hat{x}_{0,\mathrm{MAP}} = \arg\max_{x} f_X(x \mid Z = z) = \arg\max_{x} f_Z(z \mid X = x)\, f_X(x), \qquad (2)$$

where $f(\cdot)$ is a probability density function (PDF).

The task in (2) is likewise computationally demanding; hence, many relaxed-Bayesian solvers have been devised according to various prior types and applied techniques, such as sparse Bayesian learning (SBL) [9]-[13], [20], approximate minimum-mean-squared-error (AMMSE) estimation [14]-[17], and belief propagation (BP) with sparse sensing matrices (BP-SM) [18]-[21]. A summary of the relaxed-Bayesian solvers is given in Table I.

In this paper, we are mainly interested in the BP-SM framework, which has been investigated as a low-computational approach to solve linear estimation problems such as $z = \Phi x_0 + n$. In this framework, the matrix $\Phi$ is assumed to be a sparse matrix which has the tree-like property, and the statistical connection of $x_0$ and $z$ is described with the bipartite graph model of $\Phi$. The tree-like property ensures that the corresponding graph is asymptotically cycle-free [23], [24], [33]. In addition, it is known that the vector finding problem can be decomposed into a sequence of scalar finding problems in the BP-SM framework, where the marginal posterior for each scalar estimation is obtained by an iterative message-passing process. This decomposition has been explained by the decoupling principle [22], [24].

For implementation of BP, two approaches have been mainly 
discussed according to the message representation: 1) the 
sampled-message based BP [27], [28], where the message is
directly sampled from the corresponding PDF with a certain
step size such that the message is treated as a vector, and
2) the parametric-message based BP (also called relaxed BP)
[23], [29], [30], where the message is described as a function
with a certain set of parameters such as mean and variance. 
If the sampled-message approach is chosen, quantization error 
will be induced according to the step size. If the parametric- 
message approach is used, some model approximation must 
be applied for stable iteration at the expense of approximation 
error. 

As applications of the BP-SM approach, the low-density parity-check coding [31]-[33] and the CDMA multiuser detection problems [22], [25], [26] are well known in addition to the NSR works [18]-[21].

Based on the rigorous theoretical support for the BP-SM framework by Guo et al. [22]-[25] and Tanaka et al. [26], and motivated by recent NSR works by Baron et al. [18], [19], Tan et al. [20], and Akcakaya et al. [21], in this paper we aim to develop a BP-SM type algorithm as an alternative for solving (2). We refer to the proposed algorithm as detection-directed sparse estimation via Bayesian hypothesis test and belief propagation (BHT-BP). Differently from the works [18]-[21] solving the MAP problem in (2), the proposed algorithm takes a structure of detection-directed (DD) estimation, which consists of a signal support detector and a signal value estimator. The support detector is designed using a combination of the sampled-message based BP and a novel Bayesian hypothesis test (BHT), and the signal value estimator operates in the minimum mean-square error (MMSE) sense. The DD-structure follows the common procedure of first using the measurements at hand to detect the signal support set. This detected support is then used in the model of the sparse signal, and an MMSE estimator finds the signal values as if the detected support set were in fact the correct set.

TABLE I
Relaxed-Bayesian solvers for sparse estimation

| Algorithm type | Sensing matrix type | Prior type | Utilized techniques |
|---|---|---|---|
| BP-SM | Sparse-ternary [18], [19]; Sparse-binary [20], [21] | Gaussian-mixture [18], [19]; Hierarchical-Gaussian [20], [21] | Belief propagation (BP) [18], [19]; Expectation-maximization (EM), BP [20], [21] |
| AMMSE | Gaussian-random [14]; Training-based [15]; Random-unitary [16] | Spike-and-slab [14]; Gaussian-mixture [15], [16] | Fast-Bayesian-Matching-Pursuit [14]; Random-OMP [15]; Closed-form-MMSE [16] |
| SBL | Gaussian-kernel [9]; Gaussian-random [10]-[13]; Uniform-random [12] | Hierarchical-Gaussian [9]-[11]; Spike-and-slab [13]; Hierarchical-Laplace [12] | Relevance-vector-machine, EM [9]-[12]; Markov-chain-Monte-Carlo [13] |

This DD-methodology was originally investigated by Middleton et al. for estimation of noisy signals [36], and it covers wide application areas, such as communication systems [37], [38] and speech processing [39]. For NSR, similar structures have been independently studied in a line of previous works in the AMMSE group by Schniter et al., Elad et al., and Lee [14]-[17], [40].

Then, the proposed algorithm achieves the following prop- 
erties: 

1) Providing robust signal support detection against the 
measurement noise, 

2) Removing quantization effect caused by the use of the 
sampled-message based BP. 

Here, the "oracle estimator" refers to the estimator which has
perfect knowledge of the support.

The combination of BHT and BP enables robust detection of the signal support against measurement noise, which was partially introduced in our previous work [43]. In the support detection, the sampled-message based BP provides the marginal posterior for each scalar problem according to the decoupling principle, and the BHT-process then detects the supportive state of each scalar element by measuring the inner products between the marginal posterior and reference functions composed of the prior density. This BHT-detector utilizes the signal posterior information more efficiently than the conventional MAP-approaches [18]-[21]. When the measurements are noisy, density spread occurs in the marginal posteriors, leading to difficulty in making correct decisions via the MAP-approach. In contrast, the BHT-based support detector compensates for this weak point by scanning the entire range of the posterior. Such hypothesis-based support detectors have been discussed in wavelet-domain denoising problems [41], [42]; however, they used thresholding techniques to sort out significant wavelet coefficients, unlike our work, which uses the inner product.

In addition, we emphasize that the proposed algorithm effectively removes the quantization effect coming from the use of the sampled-message based BP. The quantization effect limits the performance of the MAP-approach in both the signal value estimation and the support detection. In the proposed algorithm, the use of the DD-structure makes the performance independent of the message sampling in the BP. In addition, we eliminate the quantization error in the support detection by applying the minimum value of the signal on its support, denoted by $x_{\min}$, to the reference functions. The importance of $x_{\min}$ in the NSR problems was theoretically highlighted by Wainwright et al. in [34], [35], where they showed that perfect support recovery is very difficult even with arbitrarily large SNR if $x_{\min}$ is very small. Hence, we regulate $x_{\min}$ in our signal model, reflecting the knowledge of $x_{\min}$ in the BHT-detection. To the best of our knowledge, $x_{\min}$ has not been reflected in practical algorithms in the sparse recovery literature, and, surprisingly, this reflection enables the performance of the proposed algorithm to approach that of the oracle estimator as SNR increases.

The computational complexity of the proposed algorithm is $O(N \log N + KM)$ (if the cardinality of the support is fixed to $K$), which includes an additional cost $O(KM)$ owing to the BHT-process on top of the cost of BP, $O(N \log N)$. Nevertheless, the cost of the proposed algorithm is lower in practice since the proposed algorithm can catch the signal support with a smaller memory size than NSR algorithms using only the sampled-message based BP.

The remainder of the paper is organized as follows. We 
first briefly review a line of the BP-SM algorithms, and then 
make a remark for the relation between the BP-SM solvers 
and approximate message passing (AMP)-type algorithms in 
Section II. In Section III, we define our problem formulation. 
Section IV provides precise description for the proposed 
algorithm and Section V provides exemplary discussion to 
explain and support strengths of the proposed algorithm. In 
Section VI, we provide experimental validation to show the 
advantage of the proposed algorithm and compare to the recent 
algorithms. Finally, we conclude the paper in Section VII. 

II. Related Works 

In this section, we provide a brief introduction to the previous
works in the BP-SM algorithms [18]-[21]. The common
feature in these algorithms is the use of BP in conjunction with 
sparse sensing matrices, to approximate the signal posterior, 
where a sparsifying prior is imposed according to the signal 
model. In addition, we make a remark to distinguish the BP- 
SM works from AMP-type algorithms. 
A. BP-SM Algorithms

Baron et al. were the first to propose the use of BP for the sparse recovery problem with sparse sensing matrices [18], [19]. The algorithm is called CS-BP. The signal model of CS-BP is a compressible signal which has a small number of large elements and a large number of near-zero elements. The authors associated this signal model with a two-state Gaussian mixture prior, given as

$$f_X(x) = \prod_{i=1}^{N} \left[ q\, \mathcal{N}(x_i; 0, \sigma_{X_1}^2) + (1 - q)\, \mathcal{N}(x_i; 0, \sigma_{X_0}^2) \right], \qquad (3)$$

where $q \in [0, 1)$ denotes the probability that an element has the large value, and $\sigma_{X_1} \gg \sigma_{X_0}$. Therefore, the prior is fully parameterized with $\sigma_{X_0}$, $\sigma_{X_1}$, and $q$. CS-BP performs MAP or MMSE estimation using the signal posterior obtained from BP, where the authors applied both the sampled-message and the parametric-message approaches for the BP-implementation. The recovery performance degrades when measurement noise is severe since CS-BP was basically designed under the noiseless setup.

Tan et al. proposed an algorithm under the BP-SM setup called BP-SBL [20]. This work is based on the SBL-framework [9]-[12], which uses two-layer hierarchical-Gaussian prior models given as

$$f_X(x \mid a, b) = \prod_{i=1}^{N} \int_0^{\infty} \mathcal{N}(x_i; 0, \tau_i^{-1})\, f_\Gamma(\tau_i \mid a_i, b_i)\, d\tau_i, \qquad (4)$$

where $f_\Gamma(\tau \mid a, b)$ is the hyper-prior following the Gamma distribution with its parameters $a_i, b_i$. At each iteration, the parameters $a_i, b_i$ of the prior are estimated using expectation maximization (EM). Therefore, the posterior for the signal estimation is iteratively approximated from the prior. The authors applied the BP-SM setup to reduce the computational cost of EM. BP-SBL is an algorithm using parametric-message based BP where every message is approximated by a Gaussian PDF which can be fully described by its mean and variance. In addition, BP-SBL is input parameter-free, which means this algorithm is adaptive to any signal model and noise level since EM estimates the parameters associated with any given model during the iteration. However, such parameter estimation will not be accurate when noise is severe; therefore, the denoising ability of BP-SBL is limited when measurements are highly corrupted.

Most recently, Akcakaya et al. proposed SuPrEM under a framework similar to BP-SBL, which uses a combination of EM and the parametric-message based BP [21]. The main difference from BP-SBL is the use of a specific type of hyper-prior called Jeffreys' prior, $f_J(\tau_i) = 1/\tau_i \ \ \forall \tau_i \in [T_i, \infty]$. The use of Jeffreys' prior reduces the number of input parameters while sparsifying the signal. Therefore, the prior is given as

$$f_X(x) = \prod_{i=1}^{N} \int_0^{\infty} \mathcal{N}(x_i; 0, \tau_i)\, f_J(\tau_i)\, d\tau_i. \qquad (5)$$

The sensing matrix used in SuPrEM is restricted to a sparse-binary matrix which has fixed column and row weights, called low-density frames. They are reminiscent of the regular LDPC codes [31]. In addition, the signal model is confined to $K$-sparse signals consisting of $K$ nonzeros and $N - K$ zeros since SuPrEM includes a sparsifying step which chooses the $K$ largest elements at the end of each iteration. The noise statistic is an optional input to the algorithm. Naturally, if the noise information is available, SuPrEM will provide an improved recovery performance.

Fig. 1. Experimental success rate of sparse recovery via the standard AMP under the use of sparse-Gaussian matrices. The success rate is plotted as a function of the matrix sparsity, defined as the percentage of nonzero entries in the sensing matrix. We record a success if the MSE, defined in (17), is below $10^{-3}$. The reconstructed signal length is $N = 1024$ and 500 iterations were used.

B. Relation to AMPs 

It is interesting to associate the algorithms in the BP-SM group with the class of AMP algorithms, which was originally invented by Donoho et al. in [44], [45] and then analyzed and refined by Bayati et al. [46] and Rangan [47], because both types of algorithms utilize BP. The performance of the AMPs coincides with that of the $l_1$-solvers under the large limit assumption $(N \to \infty)$ in the noiseless case [44], [46]. Furthermore, a recent result by Donoho et al. showed that the performance of the AMPs is also equivalent to that of LASSO [6] under the noisy setting with appropriate parameter calibration [48].

The AMPs work well when sufficient density of the sensing matrix is maintained as $N \to \infty$ since the AMPs were developed on the basis of an approximation of BP-messages by the central limit theorem [45]. Therefore, if a sparse matrix is applied to the AMPs, the message approximation does not work and the decoder fails in the recovery. We validate this claim by our own experimental result shown in Fig. 1, where we simulated the recovery success rate of the standard AMP [44] without additive noise as a function of the matrix sparsity, defined as the percentage of nonzero entries in the sensing matrix. The recovery of the AMP is not successful when the sensing matrix is sparse (typically less than 10% matrix sparsity) regardless of the number of measurements $M$, as shown in the example of Fig. 1. Namely, the AMPs recover sparse signals well with dense matrices at a low computational cost of $O(MN \log N)$, but they do not enjoy the benefits from the use of sparse matrices.

III. Problem Formulation 

The goal of the proposed algorithm is to recover the object signal $x_0$ given the knowledge of $\Phi$ and the noisy measurements $z$ as follows:

$$z = \Phi x_0 + n, \qquad (6)$$

where we consider the use of a fat sparse-binary sensing matrix $\Phi \in \{0, 1\}^{M \times N}$ $(M < N)$ which has very low matrix sparsity (typically less than 10% matrix sparsity) and the tree-like property. We regulate the matrix sparsity using the fixed column weight $L$ since this regulation enables the matrix $\Phi$ to span the measurement space with a basis having equal energy. For the noise distribution, we assume an i.i.d. zero-mean Gaussian density such that the noise vector $n \in \mathbb{R}^M$ is drawn from $\mathcal{N}(0, \sigma_N^2 I)$.

Fig. 2. DD-estimation in the proposed algorithm.

In the remainder of this section, we introduce some necessary concepts, such as the signal model and the graph representation of $\Phi$ for our framework, and then discuss our approach to solve this NSR problem.

A. Graph Representation of $\Phi$

Bipartite graphs effectively represent linear systems with sparse matrices such as the matrix $\Phi$. Let $\mathcal{V} := \{1, ..., N\}$ denote the set of indices corresponding to the elements of the signal vector, $x_0 = [x_{0,1}, ..., x_{0,N}]$, and $\mathcal{C} := \{1, ..., M\}$ denote the set of indices corresponding to the elements of the measurement vector, $z = [z_1, ..., z_M]$. In addition, we define the set of edges connecting $\mathcal{V}$ and $\mathcal{C}$ as $\mathcal{E} := \{(j, i) \in \mathcal{C} \times \mathcal{V} \mid \phi_{ji} = 1\}$, where $\phi_{ji}$ is the $(j, i)$-th element of $\Phi$. Then, a bipartite graph $\mathcal{G} = (\mathcal{V}, \mathcal{C}, \mathcal{E})$ fully describes the neighboring relation in the linear system. For convenience, we define the neighbor sets of $\mathcal{V}$ and $\mathcal{C}$ as $N_{\mathcal{V}}(i) := \{j \in \mathcal{C} \mid (j, i) \in \mathcal{E}\}$ and $N_{\mathcal{C}}(j) := \{i \in \mathcal{V} \mid (j, i) \in \mathcal{E}\}$, respectively. Note that $|N_{\mathcal{V}}(i)| = L$ under our setup on $\Phi$.
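The graph description above can be illustrated with a short Python sketch. It draws a sparse-binary matrix with fixed column weight L and lists the two neighbor sets. The function names and the purely random placement of ones are assumptions made for illustration; in particular, random placement does not by itself guarantee the tree-like property assumed in this paper.

```python
import numpy as np

def sparse_binary_matrix(M, N, L, seed=0):
    """Return an M x N matrix over {0, 1} with exactly L ones in every column."""
    rng = np.random.default_rng(seed)
    Phi = np.zeros((M, N), dtype=np.int8)
    for i in range(N):
        # pick L distinct measurement nodes for signal node i
        rows = rng.choice(M, size=L, replace=False)
        Phi[rows, i] = 1
    return Phi

def neighbor_sets(Phi):
    """Neighbor sets of the bipartite graph: N_V[i] (rows) and N_C[j] (columns)."""
    M, N = Phi.shape
    N_V = [np.flatnonzero(Phi[:, i]).tolist() for i in range(N)]
    N_C = [np.flatnonzero(Phi[j, :]).tolist() for j in range(M)]
    return N_V, N_C

# Small example with the sizes used for illustration: N = 6, M = 4, L = 2.
Phi = sparse_binary_matrix(M=4, N=6, L=2)
N_V, N_C = neighbor_sets(Phi)
edges = [(j, i) for i in range(Phi.shape[1]) for j in N_V[i]]
```

Every signal node has exactly L neighbors, while the row weights are only equal to NL/M on average under this construction.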

B. Signal Model 

Let $x_0 \in \mathbb{R}^N$ denote a sparse vector which is a deterministic realization of a random vector $X$. Here, we assume that the elements of $X$ are i.i.d., and each $X_i$ belongs to the support set, denoted by $\mathrm{supp}(X)$, with a rate $q \in [0, 1]$. To indicate the supportive state of $X_i$, we define a state vector $S(X)$ where each $S_i \in S(X)$ is defined as

$$S_i = S(X_i) = \begin{cases} 1, & \text{if } i \in \mathrm{supp}(X) \\ 0, & \text{else} \end{cases} \quad \text{for all } i \in \mathcal{V}. \qquad (7)$$

Therefore, the signal sparsity is given as $K = \|S\|_0$. For the signal values on the support, we will deal with two cases, which are

1) Gaussian signals: $X_{i \in \mathrm{supp}(X)} \sim \mathcal{N}(0, \sigma_{X_1}^2)$,

2) Signed signals: $X_{i \in \mathrm{supp}(X)} \sim \frac{1}{2}\delta_{-\sigma_{X_1}} + \frac{1}{2}\delta_{+\sigma_{X_1}}$,

where $\delta_\tau$ denotes the delta function peaked at $\tau$. Namely, in the first case, the values on the support are distributed according to an i.i.d. zero-mean Gaussian density with variance $\sigma_{X_1}^2$, and in the second case the magnitude of the values is fixed to $\sigma_{X_1}$ and the sign follows a Bernoulli distribution with probability 1/2. In addition, for the Gaussian signal, we regulate the minimum value $x_{\min}$ on the support as a key parameter of the signal model, such that

$$x_{\min} \le |x_{0,i}| \quad \text{for all } i \in \mathrm{supp}(x_0). \qquad (8)$$



Indeed, Wainwright et al. emphasized the importance of $x_{\min}$ in the NSR problems [34], [35]. They stated that if $x_{\min}$ is very small, success of the noisy sparse recovery is very difficult even with arbitrarily large SNR. Namely, if the signal contains at least one such very small element belonging to the support, the recovery can fail regardless of the SNR level since the small element can be buried even by negligible noise.

In the Bayesian framework, the prior density takes the role of pursuing the sparsest solution from the infinitely many solutions of the underdetermined system. Therefore, the proper choice of the sparsifying prior according to the signal model is highly important. According to our signal model, we consider spike-and-slab densities as our prior. The prior for an element $X_i$ is given as

$$f_X(x) = q f_X(x \mid S = 1) + (1 - q) f_X(x \mid S = 0) = q\, \mathcal{N}(x; 0, \sigma_{X_1}^2) + (1 - q)\, \delta_0. \qquad (9)$$

This type of prior has been widely used in modeling of strictly sparse signals [13], [14], [53] since the prior can simply characterize the signals with $\sigma_{X_1}$ and $q$, as well as easily associate the signal model with other models such as the hidden Markov chain and wavelet models. In addition, we note that the spike-and-slab prior is a particular case of the two-state Gaussian mixture in (3) as $\sigma_{X_0} \to 0$.
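As an illustration of the signal model, the following Python sketch draws a sparse vector whose elements join the support with rate q and take either Gaussian or signed values. Enforcing the minimum magnitude by redrawing values (rejection sampling) is an assumption made here for illustration; the text only states that the minimum value on the support is regulated, not how. The function name and the example parameters are also hypothetical.

```python
import numpy as np

def sample_sparse_signal(N, q, sigma_x1, x_min=0.0, signed=False, seed=0):
    """Draw (x0, s): a sparse signal and its 0/1 state vector."""
    rng = np.random.default_rng(seed)
    on_support = rng.random(N) < q          # each element joins the support w.p. q
    K = int(on_support.sum())
    x0 = np.zeros(N)
    if signed:
        # magnitude fixed to sigma_x1, sign is +/- with probability 1/2
        x0[on_support] = sigma_x1 * rng.choice([-1.0, 1.0], size=K)
    else:
        vals = rng.normal(0.0, sigma_x1, size=K)
        too_small = np.abs(vals) < x_min
        while too_small.any():              # assumed rule: redraw values below x_min
            vals[too_small] = rng.normal(0.0, sigma_x1, size=int(too_small.sum()))
            too_small = np.abs(vals) < x_min
        x0[on_support] = vals
    return x0, on_support.astype(int)

x_gauss, s_gauss = sample_sparse_signal(N=1000, q=0.05, sigma_x1=5.0, x_min=1.25)
x_sign, s_sign = sample_sparse_signal(N=1000, q=0.05, sigma_x1=5.0, signed=True)
```

The sparsity of a realization is then simply the number of ones in the state vector, for example `int(s_gauss.sum())`.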

C. Solution Approach 

The approach of the DD-estimation is motivated by the naive MMSE sparse estimation, which was also reported in [14]-[16], as follows:

$$\hat{x}_{0,\mathrm{MMSE}} = \sum_{s \in \{0,1\}^N} E[X \mid S = s, Z = z] \cdot \Pr\{S = s \mid Z = z\}. \qquad (10)$$

We note that (10) is a weighted sum of the signal value estimation $E[X \mid S, Z = z]$ over $2^N$ possible supports given the noisy measurements $z$. Therefore, we separate the signal value estimation and the $2^N$ support search in (10) as the first step for its relaxation. Then, the signal value estimation can be represented as

$$\hat{x}_{0,\mathrm{BHT\text{-}BP}} = E[X \mid S = \hat{s}, Z = z] \qquad (11)$$

by assuming an ideal support detector for $\hat{s}$. The calculation of (11) is simple because it is well known as the conventional linear MMSE estimation, expressed as

$$\hat{x}_{0,\hat{s}} = \left( \frac{1}{\sigma_{X_1}^2} I + \frac{1}{\sigma_N^2} \Phi_{\hat{s}}^T \Phi_{\hat{s}} \right)^{-1} \frac{1}{\sigma_N^2} \Phi_{\hat{s}}^T z, \qquad (12)$$

where $\Phi_{\hat{s}} \in \{0, 1\}^{M \times K}$ is the submatrix of $\Phi$ that contains only the columns corresponding to the detected support $\hat{s}$, and $\hat{x}_{0,\hat{s}} \in \mathbb{R}^K$ is the vector that includes only the nonzero elements of $\hat{x}_0$. In addition, we know that this MMSE estimation is optimal when the signal values on the detected support $\hat{s}$ are Gaussian distributed [49].
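The value-estimation step can be sketched in a few lines of Python. The sketch assumes the standard Bayesian linear MMSE form for a zero-mean Gaussian prior with variance sigma_x1^2 on the support and white Gaussian noise with variance sigma_n^2; the function name, the small test matrix, and the chosen values are hypothetical.

```python
import numpy as np

def mmse_on_support(Phi, z, s_hat, sigma_x1, sigma_n):
    """Linear MMSE estimate of the signal values restricted to a detected support."""
    mask = np.asarray(s_hat, dtype=bool)
    Phi_s = Phi[:, mask].astype(float)       # M x K submatrix on the detected support
    K = Phi_s.shape[1]
    # (I / sigma_x1^2 + Phi_s^T Phi_s / sigma_n^2)^{-1} Phi_s^T z / sigma_n^2
    A = np.eye(K) / sigma_x1**2 + Phi_s.T @ Phi_s / sigma_n**2
    x_s = np.linalg.solve(A, Phi_s.T @ z / sigma_n**2)
    x_hat = np.zeros(Phi.shape[1])
    x_hat[mask] = x_s                        # elements off the support stay zero
    return x_hat

# Toy example: column weight 2, support known exactly, noiseless measurements.
Phi = np.array([[1, 0, 1, 0, 0, 1],
                [1, 1, 0, 0, 1, 0],
                [0, 1, 0, 1, 0, 1],
                [0, 0, 1, 1, 1, 0]])
x0 = np.array([0.0, 1.5, 0.0, 0.0, -2.0, 0.0])
s_hat = (x0 != 0).astype(int)
z = Phi @ x0
x_hat = mmse_on_support(Phi, z, s_hat, sigma_x1=1.0, sigma_n=1e-3)
```

With a correct support and a small assumed noise level, the estimate is close to the true values, which is the behavior of the oracle estimator mentioned in the Introduction.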
For the part of the $2^N$ support search, we start from an exhaustive detection, described as

$$\hat{s} = \arg\max_{s \in \{0,1\}^N} \Pr\{S = s \mid Z = z\}. \qquad (13)$$

To solve (13) efficiently, we decompose the state vector detection into $N$ scalar state detections based on the decoupling principle [22], [24], and apply binary scalar MAP-detection for each scalar state. Then, the binary scalar detection can be optimally achieved by a hypothesis test [50], given as

$$\frac{\Pr\{S_i = 1 \mid Z = z\}}{\Pr\{S_i = 0 \mid Z = z\}} \underset{H_0}{\overset{H_1}{\gtrless}} 1, \qquad (14)$$

where $H_0 : S(X_i) = 0$ and $H_1 : S(X_i) = 1$ are the two possible hypotheses. This support detection approach simplifies the $2^N$ exhaustive search in (13) to $N$ hypothesis tests. Here, we note that the support detection in (14) operates with a decision rule that is independent of the value estimation in (12). Therefore, this DD-approach to (10) is reasonable since the Bayesian risks of the support detection and the signal value estimation can be independently minimized [36]. The overall flow of the DD-estimation is depicted in Fig. 2.

In order to justify our solution approach, we will provide experimental validation compared to the recent NSR algorithms, as a function of SNR defined as

$$\mathrm{SNR} := 10 \log_{10} \frac{E\|\Phi x_0\|_2^2}{M \sigma_N^2} \ \mathrm{dB}. \qquad (15)$$

The performance of the support detection is evaluated by the pairwise state error rate (SER), defined as

$$\mathrm{SER} := \Pr\{S(\hat{x}_{0,i}) \ne S(x_{0,i}) \mid i \in \mathcal{V}\}, \qquad (16)$$

and the overall performance of the signal recovery is measured in terms of the normalized mean square error (MSE), defined as

$$\mathrm{MSE} := \frac{\|\hat{x}_0 - x_0\|_2^2}{\|x_0\|_2^2}. \qquad (17)$$
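The three evaluation quantities named above (SNR, state error rate, and normalized MSE) can be computed empirically as in the following Python sketch. Two points are assumptions rather than statements of this paper: the expectation in the SNR is replaced by a single realization, and the state error rate is taken as the fraction of indices whose detected state differs from the true one. The function names are hypothetical.

```python
import numpy as np

def snr_db(Phi, x0, sigma_n):
    """Empirical SNR in dB: signal energy per measurement over the noise variance."""
    M = Phi.shape[0]
    return 10.0 * np.log10(np.sum((Phi @ x0) ** 2) / (M * sigma_n**2))

def state_error_rate(s_hat, s_true):
    """Fraction of indices whose detected state differs from the true state."""
    return float(np.mean(np.asarray(s_hat) != np.asarray(s_true)))

def normalized_mse(x_hat, x0):
    """Squared reconstruction error normalized by the signal energy."""
    x_hat, x0 = np.asarray(x_hat, dtype=float), np.asarray(x0, dtype=float)
    return float(np.sum((x_hat - x0) ** 2) / np.sum(x0 ** 2))

snr_example = snr_db(np.eye(2), np.array([1.0, 1.0]), sigma_n=1.0)
ser_example = state_error_rate([1, 0, 1, 0], [1, 1, 1, 0])
mse_example = normalized_mse([1.0, 0.0], [2.0, 0.0])
```

In a Monte Carlo experiment these values would be averaged over many realizations of the signal, the matrix, and the noise.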



IV. Proposed Algorithm 



The proposed algorithm straightforwardly provides the estimate $\hat{x}_0$ via the MMSE estimation in (12) once the detected support set $\hat{s}$ from (13) is given. In addition, the detected support set can be obtained elementwise from (14), on the basis of the decoupling principle. Therefore, an efficient design of the scalar state detector in (14) is very important in this work. In this section, we explain the implementation of the scalar state detector based on the hypothesis test given in (14), using the combination of the sampled-message based BP and BHT.



A. Sampled-message based Belief Propagation 

BP provides the marginal posterior for the hypothesis test in (14) for each element. Since the signal is real-valued, each BP-message takes the form of a PDF, and the BP-iteration consists of density-message passing. We provide a brief summary of density-message passing in Appendix I. To implement the BP passing density-messages, we consider the sampled-message approach, which has been discussed by Sudderth et al. [27] and Noorshams et al. [28]. For sparse recovery, Baron et al. applied the approach in [18], [19]. The main strength of the sampled-message approach is its adaptivity to various signal models. In addition, it shows faster convergence than the parametric-message approach [23], [29], [30] if the sampling step size is sufficiently fine. The reason is that the sampled-message approach does not use any model approximation or reduction for the message representation during the iteration. Based on (33) and (37) from Appendix I, we will provide a practical message update rule of the sampled-message based BP for the proposed algorithm.

Fig. 3. Graph representation of the sampled-message based BP ($N = 6$, $M = 4$, $L = 2$), where $\otimes$ is the operator for the linear convolution of PDFs and $\delta_{z_j}$ indicates the delta function peaked at $z_j$.

In our implementation, we set the sampling step $T_s$ on the basis of the three-sigma rule [52], given as

$$T_s := \frac{2 \cdot 3\sigma_{X_1}}{N_d}, \qquad (18)$$

where $N_d$ is the number of samples for a density-message. Hence, we define the sampling process of the PDFs as

$$\mathrm{Samp}[f(x)] := f(mT_s - 3\sigma_{X_1}) = f[m] \quad \text{for } m = 0, ..., N_d - 1, \qquad (19)$$

where $\mathrm{Samp}[\cdot]$ denotes the sampling process. Hence, the density-messages are treated as vectors of size $N_d$ in this approach.
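The sampling rule can be made concrete with a short Python sketch that builds the grid covering three standard deviations on each side of zero and samples a spike-and-slab prior on it. How the delta spike is discretized is not specified in the text; placing its whole mass in the grid point closest to zero and normalizing the slab to unit mass are assumptions of this sketch, as are the function names.

```python
import numpy as np

def sample_grid(sigma_x1, N_d):
    """Grid x[m] = m*T_s - 3*sigma_x1 with step T_s = 2*3*sigma_x1 / N_d."""
    T_s = 2.0 * 3.0 * sigma_x1 / N_d
    x = np.arange(N_d) * T_s - 3.0 * sigma_x1
    return x, T_s

def sampled_spike_and_slab(sigma_x1, q, N_d):
    """Sampled prior as a probability vector, plus its slab and spike components."""
    x, _ = sample_grid(sigma_x1, N_d)
    slab = np.exp(-0.5 * (x / sigma_x1) ** 2)
    slab /= slab.sum()                       # slab component with unit mass
    spike = np.zeros(N_d)
    spike[np.argmin(np.abs(x))] = 1.0        # assumed: delta mass in the bin nearest 0
    prior = q * slab + (1.0 - q) * spike
    return x, prior, slab, spike

x, prior, slab, spike = sampled_spike_and_slab(sigma_x1=1.0, q=0.1, N_d=64)
```

Choosing the number of samples as a power of two places a grid point exactly at zero and suits FFT-based processing of the messages.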

Let $a^l_{i \to j}[m] \in [0, 1]^{N_d}$ denote a sampled density-message from $X_i$ to $Z_j$ at the $l$-th iteration, called the signal message, and let $b^l_{j \to i}[m] \in [0, 1]^{N_d}$ denote the message from $Z_j$ to $X_i$, called the measurement message. The signal message $a^l_{i \to j}[m]$ includes information on the marginal posterior $f_{X_i}(x \mid Z = z)$, being obtained from (33) by simply replacing the measurement densities, $f_{Z_k}(z \mid X_i = x_i)$, with the measurement messages of the previous iteration except the intrinsic information. That is,

$$a^l_{i \to j}[m] = \eta\left[ f_X[m] \times \prod_{k \in N_{\mathcal{V}}(i) \setminus \{j\}} b^{l-1}_{k \to i}[m] \right] \qquad (20)$$

for all $(j, i) \in \mathcal{E}$, where $\eta[\cdot]$ is the normalization function to make $\sum_m a^l_{i \to j}[m] = 1$. Similarly, the measurement message $b^l_{j \to i}[m]$ includes information on the measurement density $f_{Z_j}(z \mid X_i = x_i)$, being obtained from the expression of (37) by replacing the associated marginal posteriors, $f_{X_k}(x \mid Z = z)$, with the signal messages, that is,

$$b^l_{j \to i}[m] = \mathrm{Samp}[\mathcal{N}(n; z_j, \sigma_N^2)] \otimes \left( \bigotimes_{k \in N_{\mathcal{C}}(j) \setminus \{i\}} a^l_{k \to j}[-m] \right), \qquad (21)$$

where $\otimes$ is the operator for the linear convolution of PDFs. In addition, under the Gaussian noise assumption, we know that $\delta_{z_j} \otimes f_{N_j}(n) = \mathcal{N}(n; z_j, \sigma_N^2)$ in (21) based on (37). Here, we note that the measurement message calculation utilizes the noise statistics, unlike that of the standard CS-BP [see (7) in [19]]. This improvement remarkably enhances the recovery performance in the low SNR regime even though the noise statistic loses its effect with sufficiently high SNR.

Fig. 4. Calculation of the probability ratio for the hypothesis test, where $r_0(x)$ and $r_1(x)$, the ratios of $f_X(x \mid S)$ to $f_X(x)$, are reference functions consisting of the prior knowledge.

The convolution operations in (21) can be efficiently computed by using the fast Fourier transform (FFT). Accordingly, we express the measurement message calculation as

$$b^l_{j \to i}[m] = \mathcal{F}^{-1}\left[ \mathcal{F}\left[ \mathrm{Samp}[\mathcal{N}(n; z_j, \sigma_N^2)] \right] \times \prod_{k \in N_{\mathcal{C}}(j) \setminus \{i\}} \mathcal{F}\left[ a^l_{k \to j}[-m] \right] \right], \qquad (22)$$

where $\mathcal{F}$ denotes the FFT operation. Therefore, for efficient use of the FFT, the sampling step $T_s$ should be appropriately chosen such that $N_d$ is a power of two. In fact, the use of the FFT brings a small calculation gap since the FFT-based calculation performs a circular convolution that produces output having a heavy tail. The heaviness increases as the column weight of $\Phi$ increases. However, the difference can be ignored, especially when the messages are bell-shaped densities.
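The gap between circular and linear convolution mentioned above can be seen in a small Python sketch. Multiplying equal-length FFTs gives a circular convolution in which the tail of the linear result wraps around to the first samples; zero-padding to at least the full output length recovers the linear convolution. The function names and the two example vectors are hypothetical.

```python
import numpy as np

def circular_conv_fft(a, b):
    """Circular convolution of two equal-length vectors via the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def linear_conv_fft(a, b):
    """Linear convolution via the FFT, using zero-padding to length len(a)+len(b)-1."""
    n = len(a) + len(b) - 1
    return np.real(np.fft.ifft(np.fft.fft(a, n) * np.fft.fft(b, n)))

a = np.array([0.1, 0.2, 0.4, 0.3])   # two sampled densities (probability vectors)
b = np.array([0.5, 0.5, 0.0, 0.0])

circ = circular_conv_fft(a, b)       # length 4, tail of the result wraps around
lin = linear_conv_fft(a, b)          # length 7, matches np.convolve(a, b)
```

In this example the last nonzero sample of the linear result is folded into the first sample of the circular result, which is the wrap-around effect described in the text.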

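Finally, a sampled approximation of the marginal posterior $f_{X_i}[m \mid Z = z]$ for $m = 0, ..., N_d - 1$ is obtained after a certain number of iterations $l^*$ via the update rules in (20) and (21), as follows:

$$\mathrm{Samp}[f_{X_i}(x \mid Z = z)] \approx f_{X_i}[m \mid Z = z] = \eta\left[ f_X[m] \times \prod_{j \in N_{\mathcal{V}}(i)} b^{l^*}_{j \to i}[m] \right]. \qquad (23)$$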
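The final combination step, a pointwise product of the sampled prior with all incoming measurement messages followed by normalization, can be sketched in Python as follows. The function names and the toy messages are hypothetical, and the messages are assumed to be already sampled on a common grid.

```python
import numpy as np

def normalize(v):
    """Scale a nonnegative vector so that its entries sum to one."""
    return v / v.sum()

def marginal_posterior(prior, incoming):
    """Pointwise product of the sampled prior and all incoming messages, normalized.

    prior:    vector of length N_d
    incoming: array of shape (L, N_d), one row per neighboring measurement node
    """
    incoming = np.asarray(incoming, dtype=float)
    return normalize(np.asarray(prior, dtype=float) * np.prod(incoming, axis=0))

prior_toy = np.full(4, 0.25)
messages_toy = np.array([[0.1, 0.2, 0.3, 0.4],
                         [0.4, 0.3, 0.2, 0.1]])
posterior_toy = marginal_posterior(prior_toy, messages_toy)
```

For long products of small numbers, accumulating the messages in the log domain would be numerically safer; that refinement is left out of this sketch.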



B. Bayesian Hypothesis Test for Support Detection 

In order to perform the hypothesis test in (14), the decoder needs to calculate the probability ratio $\Pr\{S_i = 1 \mid Z = z\} / \Pr\{S_i = 0 \mid Z = z\}$. By factorizing over $X_i$, the hypothesis test becomes

$$\frac{\Pr\{S_i = 1 \mid Z = z\}}{\Pr\{S_i = 0 \mid Z = z\}} = \frac{\int \Pr\{S_i = 1 \mid Z = z, X_i\}\, f_{X_i}(x \mid Z = z)\, dx}{\int \Pr\{S_i = 0 \mid Z = z, X_i\}\, f_{X_i}(x \mid Z = z)\, dx} \underset{H_0}{\overset{H_1}{\gtrless}} 1. \qquad (24)$$

In practice, we replace the marginal posterior $f_{X_i}(x \mid Z = z)$ with the sampled marginal posterior $f_{X_i}[m \mid Z = z]$ from (23) for discrete signal processing, which is provided in Algorithm 1. However, we use notations of the continuous domain in the description of this hypothesis process. Given $X_i = x_i$ from BP, the given measurements $z$ do not provide any additional information on $S_i$; hence, $\Pr\{S_i \mid Z = z, X_i\} = \Pr\{S_i \mid X_i\}$ holds true. Using this fact and the Bayesian rule, the test in (24) is rewritten as

$$\frac{\int r_1(x)\, f_{X_i}(x \mid Z = z)\, dx}{\int r_0(x)\, f_{X_i}(x \mid Z = z)\, dx} \underset{H_0}{\overset{H_1}{\gtrless}} \frac{\Pr\{S = 0\}}{\Pr\{S = 1\}}, \qquad (25)$$

where $\frac{\Pr\{S = 0\}}{\Pr\{S = 1\}} = \frac{1 - q}{q}$, and $r_0(x), r_1(x)$ are reference functions consisting of the prior knowledge, defined as

$$r_0(x) := \frac{f_X(x \mid S = 0)}{f_X(x)}, \qquad r_1(x) := \frac{f_X(x \mid S = 1)}{f_X(x)}. \qquad (26)$$
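The step from (24) to (25) can be spelled out as follows. This is a sketch of the intermediate algebra, assuming the prior state probabilities $\Pr\{S = 1\} = q$ and $\Pr\{S = 0\} = 1 - q$ from the signal model and the reference functions of (26):

```latex
\begin{align*}
\Pr\{S_i = 1 \mid X_i = x\}
  &= \frac{f_X(x \mid S = 1)\,\Pr\{S = 1\}}{f_X(x)} = q\, r_1(x), \\
\Pr\{S_i = 0 \mid X_i = x\}
  &= \frac{f_X(x \mid S = 0)\,\Pr\{S = 0\}}{f_X(x)} = (1 - q)\, r_0(x), \\
\frac{\Pr\{S_i = 1 \mid Z = z\}}{\Pr\{S_i = 0 \mid Z = z\}}
  &= \frac{q \int r_1(x)\, f_{X_i}(x \mid Z = z)\, dx}
          {(1 - q) \int r_0(x)\, f_{X_i}(x \mid Z = z)\, dx}.
\end{align*}
```

Comparing the last ratio with 1 and moving the factor $q/(1-q)$ to the right-hand side gives the threshold $(1-q)/q$.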



The process to calculate the probability ratio in (25) is described as a block diagram in Fig. 4. This process has a structure similar to matched filtering in communication systems, where the detector determines the supportive state of $X_i$ by measuring inner products between the marginal posterior and the reference functions. In addition, we emphasize that this BHT-based detector is only compatible with the sampled-message based BP because the BHT-process requires full information on the signal posterior, which cannot be provided through the parametric-message based BP.
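The inner-product structure of the detector can be sketched in Python on a sampled grid. The sketch assumes a discretized spike-and-slab prior with the spike mass placed in the grid point at zero, and it compares the two inner products without dividing so that a vanishing denominator causes no problem. It does not model the use of the minimum value on the support in the reference functions. All names and numerical values are hypothetical.

```python
import numpy as np

def bht_detect(posterior, slab, spike, q):
    """Return 1 if the sampled posterior favors 'on the support', else 0."""
    prior = q * slab + (1.0 - q) * spike
    r1 = slab / prior                        # reference function for S = 1
    r0 = spike / prior                       # reference function for S = 0
    num = np.sum(r1 * posterior)             # inner product with r1
    den = np.sum(r0 * posterior)             # inner product with r0
    # num / den > (1 - q) / q, written without the division
    return int(q * num > (1.0 - q) * den)

N_d, sigma_x1, q = 64, 1.0, 0.1
x = np.arange(N_d) * (6.0 * sigma_x1 / N_d) - 3.0 * sigma_x1
slab = np.exp(-0.5 * (x / sigma_x1) ** 2)
slab /= slab.sum()
spike = np.zeros(N_d)
spike[N_d // 2] = 1.0                        # grid point located exactly at zero

def bump(center, width):
    """Toy bell-shaped posterior on the grid, normalized to unit mass."""
    p = np.exp(-0.5 * ((x - center) / width) ** 2)
    return p / p.sum()

s_on = bht_detect(bump(1.5, 0.2), slab, spike, q)     # mass far away from zero
s_off = bht_detect(bump(0.0, 0.05), slab, spike, q)   # mass concentrated at zero
s_prior = bht_detect(q * slab + (1 - q) * spike, slab, spike, q)
```

When the posterior equals the prior, both inner products are one and the decision follows the prior odds, so a sparse prior with q below one half yields the off-support decision.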



C. Computational Complexity 

In the sampled-message approach, the density-messages
in BP are vectors of size $N_d$. Therefore, the decoder
requires $O(LN_d)$ flops to calculate a signal message $a_{i \to j}$
and $O(\frac{NL}{M} N_d \log N_d)$ flops for a measurement message $b_{j \to i}$
per iteration, where $L$ denotes the column weight of the sparse
sensing matrix $\mathbf{\Phi}$. In addition, the cost of the FFT-based
convolution is $O(N_d \log N_d)$ if we assume that the row weight is $NL/M$
in an average sense. Hence, the per-iteration cost of the BP-process
is $O(NLN_d + NLN_d \log N_d) \approx O(NLN_d \log N_d)$
flops. For the hypothesis test, the decoder requires $O(N_d)$ flops
to generate the probability ratio. The cost of the hypothesis
test is much smaller than that of BP; therefore, it is ignored.
For the MMSE estimator to find signal values, the cost can be
reduced to $O(KM)$ flops by applying QR decomposition
||5T1l . Thus, the total complexity of the proposed algorithm 
is $O(I^* \times NLN_d \log N_d + KM)$ flops, and it is further
simplified to $O(I^* \times N + KM)$ since $L$ and $N_d$ are fixed. In
addition, it is known that the recovery convergence via BP
is achieved with $O(\log N)$ iterations [19],[32]. Therefore, we
finally obtain $O(N \log N + KM)$ for the complexity of BHT-BP.
We note here that the complexity of BHT-BP is much lower
than that of the naive MMSE sparse estimator, $O(2^N)$,
given in ( fTO] ). 
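
The $O(N_d \log N_d)$ term above comes from evaluating the convolution of two sampled densities with FFTs. The following is a minimal sketch of that step, assuming NumPy and a hypothetical helper name `fft_linear_convolve`; it is not the authors' implementation, only an illustration that a zero-padded FFT product reproduces the direct linear convolution of two length-$N_d$ messages.

```python
import numpy as np

def fft_linear_convolve(p, q):
    """Linear convolution of two sampled densities via zero-padded FFTs."""
    n = len(p) + len(q) - 1               # length of the full linear convolution
    size = 1 << (n - 1).bit_length()      # next power of two >= n, avoids wrap-around
    spectrum = np.fft.rfft(p, size) * np.fft.rfft(q, size)
    return np.fft.irfft(spectrum, size)[:n]

rng = np.random.default_rng(0)
N_d = 256                                 # message length used in the experiments
p = rng.random(N_d)
p /= p.sum()                              # hypothetical sampled density (sums to 1)
q = rng.random(N_d)
q /= q.sum()

direct = np.convolve(p, q)                # O(N_d^2) reference result
fast = fft_linear_convolve(p, q)          # O(N_d log N_d) result
max_abs_diff = float(np.max(np.abs(direct - fast)))
```

The padding length is chosen so that the circular convolution computed by the FFT coincides with the linear one; without it, mass would wrap around the ends of the message vector.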
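Algorithm 1 BHT-BP

Inputs: Noisy measurements $\mathbf{z}$, Sensing matrix $\mathbf{\Phi}$, Sparsifying
prior $f_X(x)$, Noise statistic $f_N(n)$, Sampling step $T_s$,
Number of the BP-iterations $I^*$.

Outputs: Reconstructed signal $\hat{\mathbf{x}}_{0,\text{BHT-BP}}$, Detected state vector $\hat{\mathbf{s}}$.

1) Sampled-message based BP:

   set $b^{0}_{j \to i} = 1$ for all $(i,j) \in \mathcal{E}$
   for $l = 1$ to $I^*$ do
      for $i = 1$ to $N$ do
         set $a^{l}_{i \to j}[m] = \eta\Big[ f_X[m] \times \prod_{k \in N_V(i) \setminus \{j\}} b^{l-1}_{k \to i}[m] \Big]$
      end for
      for $j = 1$ to $M$ do
         set $b^{l}_{j \to i}[m] = \delta_{z_j}[m] \otimes f_N[m] \otimes \Big( \bigotimes_{k \in N_C(j) \setminus \{i\}} a^{l}_{k \to j}[-m] \Big)$
      end for
   end for
   for $i = 1$ to $N$ do
      set $f_{X_i}[m|\mathbf{Z}=\mathbf{z}] = \eta\Big[ f_X[m] \times \prod_{j \in N_V(i)} b^{I^*}_{j \to i}[m] \Big]$
   end for

2) BHT for Support Detection:

   set $\gamma = q/(1-q)$
   for $i = 1$ to $N$ do
      if $\frac{\sum_m r_1[m]\, f_{X_i}[m|\mathbf{Z}=\mathbf{z}]}{\sum_m r_0[m]\, f_{X_i}[m|\mathbf{Z}=\mathbf{z}]} > \frac{1-q}{q}\ (= 1/\gamma)$ then set $\hat{s}_i = 1$
      else set $\hat{s}_i = 0$
      end if
   end for

3) Signal Value Estimation:

   set $\hat{\mathbf{x}}_{0,\text{BHT-BP}} = E[\mathbf{X}|\mathbf{S}=\hat{\mathbf{s}}, \mathbf{Z}=\mathbf{z}]$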
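
Step 3 of Algorithm 1 can be illustrated with a small sketch. It assumes, as a simplification not taken from the algorithm listing, that the nonzero entries are zero-mean Gaussian with standard deviation `sigma_x` and that the noise is white Gaussian with standard deviation `sigma_n`; under those assumptions the conditional mean given the support is a regularized least-squares solution restricted to the detected columns. The function name `mmse_on_support`, the toy sizes and the use of `np.linalg.solve` (rather than a QR-based solver) are illustrative choices.

```python
import numpy as np

def mmse_on_support(Phi, z, s_hat, sigma_x, sigma_n):
    """Conditional-mean estimate of x given a detected support (Gaussian assumption)."""
    support = np.flatnonzero(s_hat)
    x_hat = np.zeros(Phi.shape[1])
    if support.size == 0:
        return x_hat
    Phi_s = Phi[:, support]                       # columns of the detected support
    reg = (sigma_n / sigma_x) ** 2                # noise-to-signal variance ratio
    gram = Phi_s.T @ Phi_s + reg * np.eye(support.size)
    x_hat[support] = np.linalg.solve(gram, Phi_s.T @ z)
    return x_hat

rng = np.random.default_rng(1)
M, N, L = 32, 64, 4                               # toy sizes, not the paper's setup
Phi = np.zeros((M, N))
for col in range(N):                              # sparse-binary matrix, column weight L
    Phi[rng.choice(M, size=L, replace=False), col] = 1.0

x0 = np.zeros(N)
x0[[3, 17, 40]] = [6.7, -4.2, 2.5]                # hypothetical sparse signal
sigma_n = 0.01
z = Phi @ x0 + sigma_n * rng.standard_normal(M)

x_hat = mmse_on_support(Phi, z, x0 != 0, sigma_x=5.0, sigma_n=sigma_n)
```

Because the estimate is computed directly from $\mathbf{z}$ and the selected columns, its accuracy does not depend on the sampling grid used for the BP messages.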



V. Exemplary Discussion for Proposed Algorithm 

One can argue that the DD-structure of the proposed algorithm
is redundant since the marginal posteriors from the BP already
provide full information on the signal. Namely, this means
that $f_{X_i}[m|\mathbf{Z}=\mathbf{z}]$ in (23) contains the complete knowledge
for detection and estimation of $x_{0,i}$. This is true, but our
claim is that the MAP-based algorithms [18],[19], which solve
the problem in (2), do not utilize the full knowledge of the
marginal posterior.

In this section, we show through examples two weak points of the
MAP-approach, which finds each scalar estimate $\hat{x}_{0,i}$ using only the
peak location of the marginal posterior. We then discuss how the
proposed algorithm can remedy such problematic behavior of the MAP
and utilize the posterior knowledge more efficiently. We claim that
the proposed algorithm has strength in two aspects, as given below:

1) Robust support detection against additive noise: The
BHT-detector in the proposed algorithm correctly
catches the supportive state given $i \in \mathrm{supp}(\mathbf{x}_0)$ even
under severe additive noise.

2) Removing quantization effect caused by the sampled-message
based BP: The quantization effect degrades the
performance of the MAP-based algorithms in both the
signal value estimation and the support detection. In
the proposed algorithm, the DD-structure removes the
quantization effect from the signal value estimation, and
the BHT-detector reduces the chance of misdetection of
the supportive state given $i \notin \mathrm{supp}(\mathbf{x}_0)$ by applying the
knowledge of $x_{min}$ to the reference functions.

Fig. 5. Example of a marginal posterior $f_{X_i}(x|\mathbf{Z}=\mathbf{z})$ in BP-iteration under
severe noise (SNR = 10 dB), where $x_{0,i}$ is originally nonzero ($x_{0,i} = 6.7$) and
the minimum value is $x_{min} = 1.25$. (Figure annotation: the spike at $x = 0$ is
higher than the peak of the true $x_{0,i}$ due to the noise effect.)

Fig. 6. Example of a marginal posterior $f_{X_i}(x|\mathbf{Z}=\mathbf{z})$ over SNR, where $x_{0,i}$
is originally nonzero ($x_{0,i} = 6.7$), the minimum value is $x_{min} = 1.25$,
and we use 5 BP-iterations to approximate $f_{X_i}(x|\mathbf{Z}=\mathbf{z})$.

A. Robust Support Detection against Additive Noise 

The additive noise spreads probability mass in the marginal
posterior $f_{X_i}(x|\mathbf{Z}=\mathbf{z})$, leading to difficulty in the supportive
state detection via the MAP-approach given $i \in \mathrm{supp}(\mathbf{x}_0)$. For
example, Fig. 5 shows a marginal posterior obtained from severely
noisy measurements (SNR = 10 dB) via BP-iteration, where the
signal value is originally $x_{0,i} = 6.7$; hence $i \in \mathrm{supp}(\mathbf{x}_0)$.
We note that the posterior is approximately composed of a
zero-spike and a slab-density with certain probability weights.
Fig. 5 shows that the center of the slab-density moves
to the true value $x = 6.7$ over the iterations, but after
5 iterations the mass at $x = 6.7$, i.e., $f_{X_i}(x = 6.7|\mathbf{Z}=\mathbf{z})$, is
still smaller than that of the zero-spike. The reason is that the
noise spreads the slab-density over nearby values, blunting
the peak at $x = 6.7$. We call this spreading effect
density spread.
Fig. 7. Example of measurement message passing under quantization caused
by the sampled-message based BP where $T_s = 0.25$. (Figure annotations: the
measurement has true value 1.3 and quantized value 1.25; the neighboring
signal elements have true values $-2.3$, $-3.8$, $7.4$ and quantized values
$-2.25$, $-3.75$, $7.5$; the passed value is 0.25.)

Fig. 6 more clearly describes the density spread in the
marginal posterior according to different SNR levels. When the SNR
level is sufficiently high (see the cases above SNR = 20 dB
in Fig. 6), the MAP-approach can successfully detect the
supportive state since the probability mass is concentrated at the
center of the slab-density such that $f_{X_i}(x = 6.7|\mathbf{Z}=\mathbf{z}) >
f_{X_i}(x = 0|\mathbf{Z}=\mathbf{z})$. However, when SNR is low (see the
line of SNR = 10 dB in Fig. 6), the MAP-approach will fail
in the detection because the zero-spike $f_{X_i}(x = 0|\mathbf{Z}=\mathbf{z})$
becomes the highest peak in the marginal posterior such that
$f_{X_i}(x = 6.7|\mathbf{Z}=\mathbf{z}) < f_{X_i}(x = 0|\mathbf{Z}=\mathbf{z})$ due to the
density spread. The density spread does not cause errors given
$i \notin \mathrm{supp}(\mathbf{x}_0)$ since in this case the center of the slab-density
stays at $x = 0$ during the BP-iteration, regardless of the noise
level.

In contrast, the BHT-detector in the proposed algorithm
decides the supportive state by taking the density spread
effect into account. In (25), the inner products between $r_1(x)$, $r_0(x)$ and
$f_{X_i}(x|\mathbf{Z}=\mathbf{z})$ measure the portions of the marginal posterior
corresponding to $i \in \mathrm{supp}(\mathbf{x}_0)$ and $i \notin \mathrm{supp}(\mathbf{x}_0)$, respectively.
Since the inner products are associated with the entire range
of the x-axis rather than a specific point-mass, the BHT-detector
can decide the supportive state by incorporating all the mass
spread by noise. Therefore, the BHT-detector can succeed
in the detection even in the SNR = 10 dB case of Fig. 6. This
example supports the claim that the BHT-detector detects the
signal support more robustly against noise than the MAP-approach.
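
The contrast between the two detectors can be reproduced on a synthetic sampled posterior. The sketch below is only an illustration under stated assumptions: the zero-spike is modeled by a narrow Gaussian of width $T_s$, the spread slab is a Gaussian centered at 6.7 with standard deviation 2.0, and the spike and slab weights are 0.3 and 0.7. None of these numbers are taken from Fig. 5 or Fig. 6; only $q = 0.05$, $\sigma_{X_1} = 5$ and $N_d = 256$ follow the experimental setup.

```python
import numpy as np

def gauss(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

q, sigma_x, N_d = 0.05, 5.0, 256
T_s = 6.0 * sigma_x / N_d                      # approx. 0.1172
x = (np.arange(N_d) - N_d // 2) * T_s          # sampling grid, contains x = 0

# Reference functions as in (26); f0 is an assumed narrow stand-in for the zero-spike.
f1 = gauss(x, 0.0, sigma_x)
f0 = gauss(x, 0.0, T_s)
f = q * f1 + (1.0 - q) * f0
r0, r1 = f0 / f, f1 / f

# Hypothetical noisy posterior: zero-spike plus a slab spread around the true value 6.7.
spike = f0 / f0.sum()
slab = gauss(x, 6.7, 2.0)
slab /= slab.sum()
posterior = 0.3 * spike + 0.7 * slab

# MAP-style detection: look only at the peak location.
x_map = x[np.argmax(posterior)]
map_support = bool(abs(x_map) > T_s / 2)

# BHT detection: compare inner products over the whole grid, as in (25).
ratio = float(np.sum(r1 * posterior) / np.sum(r0 * posterior))
threshold = (1.0 - q) / q
bht_support = bool(ratio > threshold)
```

In this toy case the peak of the posterior sits at zero although most of the probability mass lies in the slab, so the peak-based rule misses the element while the ratio test declares it part of the support.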

B. Removing Quantization Effect Caused by the Sampled-message Based BP

1) In signal value estimation: Under the use of the
sampled-message based BP, quantization error is inevitable
in the signal value estimation. When we apply rounding for
the quantization, the quantization error is uniformly distributed
with zero mean and a variance determined by the sampling step size
$T_s$, given as

$$E\left\| Q_{T_s}\left[X_{i \in \mathrm{supp}(\mathbf{X})}\right] - X_{i \in \mathrm{supp}(\mathbf{X})} \right\|_2^2 = \frac{T_s^2}{12} = \frac{(6\sigma_{X_1}/N_d)^2}{12}, \tag{27}$$



Fig. 8. Calibrated reference functions for Bayesian hypothesis test according
to $x_{min}$, where $c = 1/6$.



where $Q_{T_s}[\cdot]$ denotes the quantization function with step $T_s$.
Therefore, the MSE performance of the MAP-based algorithms
with the sampling cannot exceed the limit given by (27) even
if the level of additive noise is extremely low. To relax this
limit, we can increase the memory size $N_d$ or confine the range
of signal values by decreasing $\sigma_{X_1}$. However, such methods
are impractical and restrict the flexibility of signal models in
the recovery. The DD-structure, described in ( p~3] > and (JTTJ, 
removes this weak point since the signal values are evaluated
using an MMSE estimator in (12) independently of $N_d$ once
the detected support is given. Furthermore, the probability ratio
for the BHT-detection can be generated from sampled marginal
densities $f_{X_i}[m|\mathbf{Z}=\mathbf{z}]$ with small $N_d$.
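
The $T_s^2/12$ limit in (27) is the variance of a rounding error that is uniform over one sampling cell, and it can be checked numerically. The sketch below is a Monte Carlo illustration, assuming zero-mean Gaussian signal values with $\sigma_{X_1} = 5$ and the step $T_s = 6\sigma_{X_1}/N_d$ with $N_d = 256$; the sample size and the helper name `quantize` are arbitrary choices.

```python
import numpy as np

def quantize(values, step):
    """Round each value to the nearest multiple of the sampling step."""
    return step * np.round(values / step)

rng = np.random.default_rng(0)
sigma_x, N_d = 5.0, 256
T_s = 6.0 * sigma_x / N_d                      # approx. 0.1172

x = rng.normal(0.0, sigma_x, size=1_000_000)   # hypothetical nonzero signal values
empirical_mse = float(np.mean((quantize(x, T_s) - x) ** 2))
predicted_mse = T_s ** 2 / 12.0                # variance of a uniform error on one cell
```

The empirical value should agree with the prediction to within Monte Carlo error, independently of the additive noise level, which is why peak-based estimates on the sampling grid cannot go below this floor.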

2) In support detection: The quantization effect can also
cause detection errors of the supportive state given $i \notin
\mathrm{supp}(\mathbf{x}_0)$, limiting the performance of the MAP-approach in the
high SNR regime. To explain such cases, we provide
an example of the measurement message-passing with respect
to $x_{0,i} = z_1 - (x_{0,1} + x_{0,2} + x_{0,3})$, as shown in Fig. 7. In this
example, we try to find the value of $x_{0,i}$ given $x_{0,1} = -2.3$,
$x_{0,2} = -3.8$, $x_{0,3} = 7.4$, $z_1 = 1.3$, and $i \notin \mathrm{supp}(\mathbf{x}_0)$,
under a noiseless setup. Note that in practical operation of the
algorithm, the value of each element corresponds to the peak
location on the x-axis of each density-message. In the
sampled-message based BP, the message sampling causes quantization
of the values such that we have $Q_{T_s}[x_{0,1}] = -2.25$,
$Q_{T_s}[x_{0,2}] = -3.75$, $Q_{T_s}[x_{0,3}] = 7.5$, $Q_{T_s}[z_1] = 1.25$ with
the step size $T_s = 0.25$, where $Q_{T_s}[\cdot]$ denotes the quantization
function. In this case, we can simply infer that this measurement
message passes a corrupted value with $|x_{0,i}| = 0.25$, which
does not match $i \notin \mathrm{supp}(\mathbf{x}_0)$. If most of the measurement
messages to $x_{0,i}$ have such corruption, the marginal posterior
of $x_{0,i}$ will have its peak at an erroneous location, leading
to a detection error under the MAP-approach. The same
event can occur given $i \in \mathrm{supp}(\mathbf{x}_0)$. However, the effect is
not significant because the case of $i \in \mathrm{supp}(\mathbf{x}_0)$ does not
bring about a support detection error. In addition, this type
of error is noticeable particularly in the high SNR regime
since such corruption is masked by additive noise when SNR
is low. Accordingly, the support detection performance of the
MAP-approach shows an error-floor in the high SNR regime.
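
The arithmetic of the Fig. 7 example can be replayed directly. The sketch below assumes quantization by rounding to the nearest multiple of $T_s$ and uses the relation $x_{0,i} = z_1 - (x_{0,1} + x_{0,2} + x_{0,3})$; with that sign convention the residual has magnitude 0.25 and a negative sign.

```python
def quantize(value, step):
    """Round a scalar to the nearest multiple of the sampling step."""
    return step * round(value / step)

T_s = 0.25
neighbours = [-2.3, -3.8, 7.4]        # x_{0,1}, x_{0,2}, x_{0,3} from the example
z1 = 1.3

exact = z1 - sum(neighbours)          # noiseless, unquantized value of x_{0,i}
quantized = quantize(z1, T_s) - sum(quantize(v, T_s) for v in neighbours)
```

Here `exact` is zero up to floating-point rounding, while `quantized` evaluates to $1.25 - 1.5 = -0.25$: a nonzero value created purely by the sampling grid, well inside $|x| < x_{min} = 1.25$.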

TABLE II
List of sparse recovery algorithms in the experiment

Algorithm        Complexity for recovery   Type of $\mathbf{\Phi}$   Use of noise statistic
BHT-BP           $O(N \log N + KM)$        sparse-binary     Yes
Standard CS-BP   $O(N \log N)$             sparse-binary     No
CS-BP-NS         $O(N \log N)$             sparse-binary     Yes
SuPrEM           $O(N \log N)$             sparse-binary     Yes
BCS              $O(MN^2)$                 dense-Gaussian    No
L1-DS            $\Omega(N^3)$             dense-Gaussian    No

Here, we aim to show how the proposed BHT-detector
utilizes the prior knowledge of $x_{min}$ to remove the error-floor
of the MAP approach. As we mentioned in Section III-A,
we consider the signal model which has $x_{min}$ as a key
parameter, according to the result of Wainwright et al. [34],[35].
Wainwright's result revealed that the regulation of $x_{min}$ is
imperative for perfect support recovery under a noisy setup. This
means that we can bring additional prior knowledge to the
sparse recovery. From (8), the knowledge of $x_{min}$ tells us
that there exist no nonzero elements with values within
$|x_{0,i}| < x_{min}$. The BHT-detector reflects $x_{min}$ to calibrate the
reference functions. Rather than the functions given in (26),
we use their calibrated versions, given as

$$r_0(x, x_{min}) := \frac{f_X(x|S=0; x_{min})}{f_X(x; x_{min})}, \qquad r_1(x, x_{min}) := \frac{f_X(x|S=1)}{f_X(x; x_{min})}, \tag{28}$$

to improve the performance for the Gaussian signal model,
where

$$f_X(x|S=0; x_{min}) = \mathcal{N}(x; 0, c\,x_{min}),$$

$$f_X(x; x_{min}) = q f_X(x|S=1) + (1-q) f_X(x|S=0; x_{min}),$$

with a constant $c \in \mathbb{R}$. For signed signals, we simply use

Xmin — C ■ 

This calibration depresses the weight of $r_1(x, x_{min})$ and
puts more weight on $r_0(x, x_{min})$ over $|x_{0,i}| < x_{min}$, as shown
in Fig. 8. This implies that the detector excludes elements
within $|x_{0,i}| < x_{min}$ from the signal support set. Therefore,
we can eliminate the misdetection cases given $i \notin \mathrm{supp}(\mathbf{x}_0)$,
such as the example in Fig. 7, effectively removing the error-floor
in the high SNR regime.

VI. Experimental Validation 

We support our investigation for the proposed algorithm 
via experimental validation. We performed two types of the 
experiments for this validation as given below: 

1) SER performance over SNR: To support our claims for 
support detection based on BHT, we simulate the SER 
performance, described in ( fTo*! , as a function of SNR, 
compared to that of the MAP-approach used in CS-BPs, 

2) MSE comparison over SNR: To compare the recovery
performance of the recent NSR algorithms listed in
Table II and the oracle estimator, in the presence of
noise, we examine MSE performance, described in (17),
as a function of SNR.

Here, for SuPrEM, BCS, and L1-DS, we obtained the source
codes from each author's webpage. For CS-BP, we implemented
it using the sampled-message based approach and the
spike-and-slab prior, in two different
versions: 1) the standard CS-BP, which is the original from
the corresponding papers [18],[19], and 2) CS-BP-NS, which
utilizes the noise statistic in the BP-iteration. In the implementation
of the sampled-message based BP for CS-BPs and BHT-BP,
we used $N_d = 256$ such that the sampling step for the
density-messages is $T_s \approx 0.1172$ with (18). For the choice
of the sensing matrix, BHT-BP and CS-BPs used sparse-binary
matrices with $L = 4$ such that the matrix sparsity is
$L/M \approx 0.78\%$, as discussed in Section III-A. L1-DS and BCS were
run with the standard Gaussian matrix having the same
column energy as the sparse-binary matrix, for fairness, i.e.,
$\|\phi_{j,\text{Gaussian}}\|_2^2 = \|\phi_{j,\text{Sparse}}\|_2^2 = L$. In addition, SuPrEM
worked with a sensing matrix generated from a low-density
frame [21].

Fig. 9. SER performance for support detection over SNR for $N = 1024$,
$q = 0.05$, $M/N = 0.5$, $\sigma_{X_1} = 5$, $x_{min} = 1.25$, and $N_d = 256$
($T_s \approx 0.1172$).

In these experiments, we examined the two types of signal
models defined in Section III-B, Gaussian signals and signed
signals, with $N = 1024$, $q = 0.05$, and $\sigma_{X_1} = 5$. For the
case of Gaussian signals, we restricted the magnitude level
of the signal elements to $x_{min} < |x_{0,i}| < 3\sigma_{X_1}$, where
$x_{min} = \sigma_{X_1}/4$. In addition, we fixed the undersampling ratio
to $M/N = 0.5$ because the focus of this paper is to investigate
the behavior of NSR over SNR. Also, we used the Monte Carlo
method with 200 trials to evaluate the average
performance.

A. SER Performance over SNR 

Fig. 9 compares the SER performance of the proposed
algorithm and the MAP-approach used in CS-BPs, where the
BP-process embedded in both algorithms utilizes the noise
statistic. We simulated the SER performance as a function of
SNR for both signal models.

1) Result at low SNR: In the low SNR regime, the noise-robust
property of the proposed BHT-detector provides a 2 dB
gain at SER = $10^{-2}$ over the MAP-approach for
the Gaussian model, as we discussed in Section V-A. In
addition, we note that the case of signed signals shows better
performance than the case of Gaussian signals, with a 4 dB gap
at SER = $10^{-3}$. The reason is that in the signed case the
nonzero values are fixed to $X_{i \in \mathrm{supp}(\mathbf{X})} = \sigma_{X_1}$; therefore, the
sparse pattern of the signal is rarely corrupted by noise and
can be detected with less difficulty than in the Gaussian case,
which can have nonzero elements $x_{0,i} < \sigma_{X_1}$.

Fig. 10. MSE comparison over SNR for $N = 1024$, $q = 0.05$, $M/N = 0.5$,
$\sigma_{X_1} = 5$, and $N_d = 256$ ($T_s \approx 0.1172$): Signed signals case.

Fig. 11. MSE comparison over SNR for $N = 1024$, $q = 0.05$, $M/N = 0.5$,
$\sigma_{X_1} = 5$, $x_{min} = 1.25$, and $N_d = 256$ ($T_s \approx 0.1172$): Gaussian signals case.

2) Result at high SNR: As SNR increases, the SER performance
of the proposed algorithm shows a waterfall behavior,
whereas that of the MAP-approach shows an error-floor limited
to an SER of $2 \times 10^{-3}$, for both signal models. In the MAP
cases, a higher SNR cannot help once the curve manifests such
an irreducible SER since the level is saturated by the quantization
effect caused by the sampled-message based BP, as discussed
in Section V-B. In contrast, the SER curve of the proposed
algorithm keeps declining with higher SNR by overcoming the quantization
effect, removing the error-floor of the MAP-approach for both
signal models. Therefore, this result indicates that the proposed
BHT-detector can provide perfect support knowledge to the
signal value estimator if a sufficient SNR level is present.



B. MSE Comparison over SNR 

This section provides an MSE comparison among the algorithms
in Table II and the oracle estimator over SNR, where
MSE* denotes the performance of the oracle estimator, given



Tr 



MSE* 



(29) 



Fig. 10 and Fig. 11 display the results for the signed signal and
the Gaussian signal cases, respectively.

1) Result at low SNR: When the measurements are heavily
contaminated by noise (below SNR = 20 dB), the performance
of all recovery algorithms is degraded. Under such
degradation, BHT-BP and CS-BP-NS outperform the others
because they fully employ the noise statistic during the
recovery process, where the difference between BHT-BP and
CS-BP-NS (2 dB SNR gap at MSE = $10^{-3}$) comes from the types of the
support detectors, as we validated in Section V-A.

The use of the noise statistic remarkably affects the performance
of BP-based algorithms, as discussed in Section IV-A. In
support of this, the standard CS-BP shows an 8 to 10 dB SNR loss
from CS-BP-NS for both signal models. SuPrEM, even
though it also takes the noise variance as an input parameter,
underperforms since it was devised
mainly for K-sparse signals, which have a support set
with fixed cardinality.

As SNR increases, BHT-BP approaches the oracle performance
MSE*, where the case of signed signals in Fig. 10 approaches
it faster than the case of Gaussian signals in Fig. 11,
with an approximately 4 dB gap. We note that this gap between
the signal models originates from the gap in the SER
performance.

2) Result at high SNR: In the high SNR regime, the performance
of the algorithms generally improves, except for
SuPrEM, for both signal models. Among the algorithms,
BHT-BP comes closest to
MSE*. This is because the BHT-detector provides perfect
support knowledge beyond a certain SNR level. Although BCS
shows competitive performance within a certain range of
SNR (SNR = 28 to 40 dB for the signed case, SNR = 30 to
34 dB for the Gaussian case), its performance saturates at
a certain level as SNR becomes higher.

For CS-BPs, the use of the noise statistic in the BP-process
is no longer effective beyond a certain SNR level. Indeed,
the MSE curves in Fig. 10 and Fig. 11 both show that the
performance of the standard CS-BP converges to that of
CS-BP-NS beyond approximately SNR = 45 dB. In addition, after
the convergence the performance saturates at a certain
level even with higher SNR. The cause of this saturation is
the quantization effect, as discussed in Section V-B. Using
(27), we can calculate the normalized MSE degradation caused by
the quantization under this experimental setup, given as



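$$\frac{E\left\| Q_{T_s}\left[X_{i \in \mathrm{supp}(\mathbf{X})}\right] - X_{i \in \mathrm{supp}(\mathbf{X})} \right\|_2^2}{E\left\| X_{i \in \mathrm{supp}(\mathbf{X})} \right\|_2^2} = 4.5786 \times 10^{-5}. \tag{30}$$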



The result in (30) closely lower-bounds the MSE of CS-BPs
at SNR = 50 dB, where the signed and Gaussian cases show
MSE = $7.973 \times 10^{-5}$ and MSE = $6.554 \times 10^{-5}$, respectively.
These results explain the performance loss of CS-BPs
due to the quantization effect and highlight that BHT-BP removes the
quantization effect using the DD-structure.
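
The number in (30) can be reproduced from the quantities of the experimental setup. The sketch below assumes that the normalization is the per-element signal power $\sigma_{X_1}^2$, which is an inference from the stated value rather than something spelled out in the text, and compares the resulting floor with the two reported MSE values at SNR = 50 dB.

```python
T_s = 0.1172          # sampling step quoted in the experimental setup
sigma_x = 5.0         # sigma_{X_1} from the experimental setup

quantization_mse = T_s ** 2 / 12.0
# Assumption: the normalisation is the per-element signal power sigma_x^2.
normalized_floor = quantization_mse / sigma_x ** 2

# Reported MSE of CS-BPs at SNR = 50 dB, relative to the quantization floor.
reported = {"signed": 7.973e-5, "Gaussian": 6.554e-5}
gaps = {name: mse / normalized_floor for name, mse in reported.items()}
```

Under this reading the floor evaluates to about $4.5786 \times 10^{-5}$, and both reported MSE values lie above it by less than a factor of two.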
VII. Conclusion

The theoretical and empirical research in this paper demonstrated
that BHT-BP is a powerful algorithm for NSR from
a denoising standpoint. In BHT-BP, we employed the DD-structure,
which consists of support detection and signal value
estimation. The support detector is designed as a combination
of the sampled-message based BP and the Bayesian hypothesis
test (BHT), and the signal value estimation is performed in the MMSE
sense.

We have shown that BHT-BP utilizes the posterior knowledge
more efficiently than the MAP-based algorithms over
the entire range of SNR. In the low SNR regime, the BHT-based
support detector provides noise-robust detection against
measurement noise. In the high SNR regime, the DD-structure
eliminates the quantization error due to the sampled-message BP
from the signal value estimation. In addition, we applied the
knowledge of $x_{min}$ to the proposed algorithm based on the
result of Wainwright et al. [34],[35]. Then, we showed that the
use of $x_{min}$ enables BHT-BP to remove the error-floor of the
MAP-based algorithms, leading the performance to approach
that of the oracle estimator as SNR increases.

We supported these advantages of BHT-BP via experimental
validation. Our experiments showed that BHT-BP outperforms
the other recent NSR algorithms over the entire range of SNR,
approaching the recovery performance of the oracle estimator
as SNR increases.

Appendix I 

Fundamentals of Density-message Passing 

The goal of BP is to iteratively approximate the marginal
posterior densities via a message update rule. The message
update rule can take various forms according to applications
and frameworks. In the NSR problems, the BP-messages are
represented as PDFs since the signal is real-valued. We refer
to such message passing as density-message passing. In this
appendix, we provide the fundamentals of the density-message
passing under the BP-SM framework.

In the discussion here, we consider a random linear model 
corresponding to d5l, given as 



$$\mathbf{Z} = \mathbf{\Phi}\mathbf{X} + \mathbf{N}, \tag{31}$$



where $\mathbf{X}$ and $\mathbf{Z}$ are random vectors for $\mathbf{x}$ and $\mathbf{z}$ respectively,
and $\mathbf{N} \sim \mathcal{N}(\mathbf{0}, \sigma_N^2 \mathbf{I})$ is a random vector for the Gaussian
noise vector $\mathbf{n}$. In addition, we assume that the sensing
matrix $\mathbf{\Phi} \in \{0,1\}^{M \times N}$ is sufficiently sparse such that the
corresponding bipartite graph is tree-like.

Given $\mathbf{Z} = \mathbf{z}$, we can represent the marginal posterior
density of $X_i$ in the form of Posterior = Prior $\times$ Likelihood / Evidence
using Bayes' rule, given as

$$f_{X_i}(x|\mathbf{Z}=\mathbf{z}) = f_X(x) \times \frac{f_{\mathbf{Z}}(\mathbf{z}|X_i = x_i)}{f_{\mathbf{Z}}(\mathbf{z})} \tag{32}$$

$$\propto f_X(x) \times \prod_{j \in N_V(i)} f_{Z_j}(z|X_i = x_i), \tag{33}$$

where we postulate that the measurements associated with $X_i$,
i.e., $\{Z_k : k \in N_V(i)\}$, are statistically independent given
$X_i = x_i$ [22]-[24] using the tree-like property of $\mathbf{\Phi}$, to move
to (33) from (32).

We call each factor of the likelihood density, i.e.,
$f_{Z_j}(z|X_i = x_i)$ for $j \in N_V(i)$, a measurement density,
which is associated with the marginal posteriors of elements
in $\{X_k : k \in N_C(j), k \neq i\}$. From (31), a measurement $Z_j$ is
represented by

$$Z_j = X_i + \sum_{k \in N_C(j) \setminus \{i\}} X_k + N_j = X_i + Y_j, \tag{34}$$

where we define $Y_j := \sum_{k \in N_C(j) \setminus \{i\}} X_k + N_j$. Then, by
marginalizing over $Y_j$, the expression of $f_{Z_j}(z|X_i = x_i)$ becomes

$$\begin{aligned}
f_{Z_j}(z|X_i = x_i) &= f_{Z_j}(x + y|X_i = x_i) \\
&= \int f_{Z_j}(x + y|X_i = x_i, Y_j)\, \underbrace{f_{Y_j}(y|X_i = x_i)}_{= f_{Y_j}(y)}\, dy \\
&= f_{Z_j}(z|X_i = x_i, Y_j) \otimes f_{Y_j}(-y),
\end{aligned} \tag{35}$$

where $f_{Y_j}(y|X_i = x_i) = f_{Y_j}(y)$ since $Y_j$ is independent of
$X_i$. Since the elements in $\{X_k : k \in N_C(j), k \neq i\}$ and $N_j$
are statistically independent given $\mathbf{Z} = \mathbf{z}$ under the tree-like
assumption [22]-[24], we may approximately evaluate the PDF
of $Y_j$ by linear convolution as

$$f_{Y_j}(y) = \Big( \bigotimes_{k \in N_C(j) \setminus \{i\}} f_{X_k}(x|\mathbf{Z}=\mathbf{z}) \Big) \otimes f_{N_j}(n). \tag{36}$$

Finally, by substituting (36) into (35), we obtain the expression
of the measurement density, given as

$$f_{Z_j}(z|X_i = x_i) = f_{Z_j}(z|X_i, Y_j) \otimes f_{N_j}(n) \otimes \Big( \bigotimes_{k \in N_C(j) \setminus \{i\}} f_{X_k}(-x|\mathbf{Z}=\mathbf{z}) \Big), \tag{37}$$

where $f_{N_j}(n) = f_{N_j}(-n)$ owing to the symmetric shape of
Gaussian densities, and $f_{Z_j}(z|X_i, Y_j) = \delta_{z_j}$ holds since no
uncertainty on $Z_j$ exists given $X_i, Y_j$, which contain information
on the neighboring signal elements.

Using the derivations in (33) and (37), we can compose the
update rule for the density-message passing. The practical
details are provided in Section IV-A, on the basis of the
sampled-message approach.
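
As an illustration of (37) in sampled form, the sketch below evaluates one measurement density on a symmetric grid: a unit mass at the quantized measurement is convolved with the noise density and with the reflected densities of the neighboring elements. The grid, the narrow Gaussian shapes used for the neighbors and the noise, and the helper names are assumptions of this sketch; the numerical values reuse the Fig. 7 example with $T_s = 0.25$.

```python
import numpy as np

T_s = 0.25
x = np.arange(-80, 81) * T_s                   # symmetric grid, odd length, x[80] = 0

def sampled_gauss(mean, std):
    """Gaussian density sampled on the grid and normalised to unit sum."""
    p = np.exp(-0.5 * ((x - mean) / std) ** 2)
    return p / p.sum()

def measurement_density(z_j, noise_pmf, neighbour_pmfs):
    """Sampled version of (37) for one measurement node."""
    message = np.zeros_like(x)
    message[np.argmin(np.abs(x - z_j))] = 1.0  # delta at the grid point nearest z_j
    message = np.convolve(message, noise_pmf, mode="same")
    for pmf in neighbour_pmfs:
        # pmf[::-1] samples f(-x) because the grid is symmetric about zero
        message = np.convolve(message, pmf[::-1], mode="same")
    return message / message.sum()

noise = sampled_gauss(0.0, 0.1)                               # hypothetical weak noise
neighbours = [sampled_gauss(m, T_s) for m in (-2.25, -3.75, 7.5)]
message = measurement_density(1.3, noise, neighbours)
peak_location = float(x[np.argmax(message)])
```

With an odd-length symmetric grid, `mode="same"` keeps each convolution result aligned with the original sample locations. The resulting message peaks at $1.25 - (-2.25 - 3.75 + 7.5) = -0.25$, the quantization-corrupted value discussed in Section V-B.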